Speeding Up the Convergence of Value Iteration in Partially Observable Markov Decision Processes
Authors
Abstract
Partially observable Markov decision processes (POMDPs) have recently become popular among many AI researchers because they serve as a natural model for planning under uncertainty. Value iteration is a well-known algorithm for finding optimal policies for POMDPs. It typically takes a large number of iterations to converge. This paper proposes a method for accelerating the convergence of value iteration. The method has been evaluated on an array of benchmark problems and was found to be very effective: It enabled value iteration to converge after only a few iterations on all the test problems.
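For reference, the sketch below shows plain point-based value iteration for a POMDP with the standard Bellman-residual stopping test; it is this slow fixed-point loop that the paper's method accelerates. The model arrays T, Z, R, the belief set, and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np

def backup(beliefs, alphas, T, Z, R, gamma):
    """One point-based Bellman backup: for each sample belief, build the
    best alpha-vector over actions from the current alpha-vector set."""
    A, O = R.shape[1], Z.shape[2]
    new_alphas = []
    for b in beliefs:
        best_val, best_vec = -np.inf, None
        for a in range(A):
            vec = R[:, a].astype(float).copy()
            for o in range(O):
                # Back-project each alpha through action a / observation o,
                # then keep the one with the highest value at belief b.
                cand = [T[:, a, :] @ (Z[:, a, o] * al) for al in alphas]
                vec = vec + gamma * max(cand, key=lambda g: b @ g)
            if b @ vec > best_val:
                best_val, best_vec = b @ vec, vec
        new_alphas.append(best_vec)
    return new_alphas

def value_iteration(beliefs, T, Z, R, gamma, eps=1e-4, max_iters=500):
    alphas = [np.zeros(R.shape[0])]
    for i in range(max_iters):
        new_alphas = backup(beliefs, alphas, T, Z, R, gamma)
        # Bellman residual at the sample beliefs; plain value iteration
        # often needs many sweeps before this drops below the threshold.
        residual = max(abs(max(b @ v for v in new_alphas) -
                           max(b @ v for v in alphas)) for b in beliefs)
        alphas = new_alphas
        if residual < eps * (1 - gamma) / (2 * gamma):
            return alphas, i + 1  # converged after i + 1 iterations
    return alphas, max_iters
```

The stopping threshold eps * (1 - gamma) / (2 * gamma) is the usual discounted-MDP bound guaranteeing the resulting value function is within eps of optimal at the sampled beliefs.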
Similar Articles
A Method for Speeding Up Value Iteration in Partially Observable Markov Decision Processes
We present a technique for speeding up the convergence of value iteration for partially observable Markov decision processes (POMDPs). The underlying idea is similar to that behind modified policy iteration for fully observable Markov decision processes (MDPs). The technique can be easily incorporated into any existing POMDP value iteration algorithms. Experiments have been conducted on ...
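As a minimal sketch of the analogy this abstract draws on, here is modified policy iteration for a fully observable MDP: each full improvement backup is followed by k cheap backups that evaluate the fixed greedy policy. The arrays T and R and the parameter k are illustrative assumptions.

```python
import numpy as np

def modified_policy_iteration(T, R, gamma, k=10, eps=1e-6, max_iters=1000):
    """T[s, a, s'] are transition probabilities, R[s, a] rewards;
    both are hypothetical placeholders for a concrete MDP model."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(max_iters):
        # Improvement: one full Bellman backup, recording the greedy policy.
        Q = R + gamma * np.einsum('sat,t->sa', T, V)
        pi = Q.argmax(axis=1)
        V_new = Q.max(axis=1)
        # Partial evaluation: k cheap backups of the *fixed* greedy policy.
        # Each is a max-free sweep, yet pushes V_new much closer to V^pi.
        for _ in range(k):
            V_new = R[np.arange(S), pi] + gamma * np.einsum(
                'st,t->s', T[np.arange(S), pi], V_new)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, pi
        V = V_new
    return V, pi
```

Setting k = 0 recovers plain value iteration; larger k trades a few inexpensive evaluation sweeps for far fewer expensive improvement backups.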
Generalized and bounded policy iteration for finitely-nested interactive POMDPs: scaling up
Policy iteration algorithms for partially observable Markov decision processes (POMDPs) offer the benefits of quick convergence and the ability to operate directly on the solution, which usually takes the form of a finite-state controller. However, the controller tends to grow quickly in size across iterations, making its evaluation and improvement costly. Bounded policy iteration pr...
Generalized and Bounded Policy Iteration for Interactive POMDPs
Policy iteration algorithms for solving partially observable Markov decision processes (POMDPs) offer the benefits of quicker convergence and the ability to operate directly on the policy, which usually takes the form of a finite-state controller. However, the controller tends to grow quickly in size across iterations, making its evaluation and improvement costly. Bounded policy iter...
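Both abstracts above point to the cost of evaluating a growing finite-state controller. As a rough illustration, evaluating such a controller amounts to solving a linear system with one unknown per (node, state) pair, so the cost grows with controller size. The controller encoding (act, succ) and the model arrays below are hypothetical.

```python
import numpy as np

def evaluate_fsc(act, succ, T, Z, R, gamma):
    """Evaluate a finite-state controller: act[n] is the action at node n,
    succ[n, o] the successor node after observation o. Solves
    V(n,s) = R(s,a) + gamma * sum_{s',o} T(s,a,s') Z(s',a,o) V(succ(n,o), s')
    as a linear system with N * S unknowns."""
    N = len(act)        # controller nodes
    S = T.shape[0]      # POMDP states
    O = Z.shape[2]      # observations
    A_mat = np.eye(N * S)
    b = np.zeros(N * S)
    for n in range(N):
        a = act[n]
        for s in range(S):
            row = n * S + s
            b[row] = R[s, a]
            for o in range(O):
                n2 = succ[n, o]
                for s2 in range(S):
                    # Move the discounted successor term to the left-hand side.
                    A_mat[row, n2 * S + s2] -= gamma * T[s, a, s2] * Z[s2, a, o]
    return np.linalg.solve(A_mat, b).reshape(N, S)
```

Because the system has N * S unknowns, each node added by policy improvement makes the next evaluation step strictly more expensive, which is the scaling problem bounded policy iteration addresses.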
POMDP solution methods
This is an overview of partially observable Markov decision processes (POMDPs). We describe POMDP value and policy iteration as well as gradient ascent algorithms. The emphasis is on solution methods that work directly in the space of policies.
Journal: J. Artif. Intell. Res.
Volume: 14
Pages: -
Published: 2001